Dynamic Summary

Our dataset was very large, so we chose to focus on a few numbers that stood out to us. First, we found the total number of observations: 10199. This is important for understanding and verifying that the data is accurate and that its findings can be applied to a larger population. Next, we looked at the number of people who used drugs while they gambled, which was 717 observations. This leads directly into our question about whether drug and alcohol use (there were 3276 observations for alcohol users) is correlated with susceptibility to gambling addiction. We then looked at the number of people with immediate family members who have gambling problems, of which we found 1297 observations. This was more than we expected: the proportion of people whose immediate family members also had gambling problems was higher than we had anticipated. Finally, we looked at the health problems of those who suffered from gambling addiction. We counted how many people had any of the following conditions: (1) physical illness, (2) mental illness, including depression, anxiety, PTSD, etc., and (3) drug or alcohol addiction. A total of 1128 people suffered from at least one of those problems. Again, we found this number surprisingly high, which made us want to look further at the data on alcohol/drug addiction and gambling addiction.
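Counts like the ones above can be obtained by summing indicator columns. The following is only a minimal sketch in base R on a toy data frame: the column names (`drugs_while_gambling`, `alcohol_use`, `family_gambling`, and the three condition columns) and the 0/1 coding are assumptions made for illustration, not the actual variables in the survey data.

```r
# Toy stand-in for the survey data. All column names and the 0/1 coding
# are hypothetical; NA marks a skipped question.
survey <- data.frame(
  drugs_while_gambling = c(1, 0, 0, NA, 1, 0),
  alcohol_use          = c(1, 1, 0, 0, NA, 1),
  family_gambling      = c(0, 1, 0, 0, 1, NA),
  physical_illness     = c(0, 0, 1, 0, 0, 0),
  mental_illness       = c(0, 1, 1, 0, NA, 0),
  addiction            = c(0, 0, 0, 0, 1, 0)
)

# Total number of observations (rows).
total_observations <- nrow(survey)

# Number of respondents answering "yes" (1) to each single question,
# ignoring missing answers.
flag_columns <- c("drugs_while_gambling", "alcohol_use", "family_gambling")
flag_counts <- sapply(survey[flag_columns], function(x) sum(x == 1, na.rm = TRUE))

# Number of respondents reporting at least one of the three conditions.
condition_columns <- c("physical_illness", "mental_illness", "addiction")
any_condition <- rowSums(survey[condition_columns] == 1, na.rm = TRUE) > 0
any_condition_count <- sum(any_condition)

print(total_observations)
print(flag_counts)
print(any_condition_count)
```

Counting "at least one condition" per row, rather than adding the three column totals, avoids double-counting respondents who report more than one condition.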

Summary Tables

##   total completed missing completion_rate
## 1 10199      4707    5492          0.4615
## # A tibble: 2 × 3
## # Groups:   Drug status [2]
##   `Drug status` number_base percentage_base
##           <int>       <int>           <dbl>
## 1             0        9599           94.1 
## 2             1         600            5.88
## # A tibble: 2 × 4
## # Groups:   Drug status [2]
##   `Drug status` number_follow percentage_follow whole_percentage
##           <int>         <int>             <dbl>            <dbl>
## 1             0          4488             95.4             44   
## 2             1           219              4.65             2.15
## # A tibble: 2 × 6
## # Groups:   Drug status [2]
##   `Drug status` number_base percentage_base number_follow percentage_f…¹ whole…²
##           <int>       <int>           <dbl>         <int>          <dbl>   <dbl>
## 1             0        9599           94.1           4488          95.4    44   
## 2             1         600            5.88           219           4.65    2.15
## # … with abbreviated variable names ¹​percentage_follow, ²​whole_percentage
## # A tibble: 7 × 3
## # Groups:   lottery status [7]
##   `lottery status` number_base percentage_base
##              <int>       <int>           <dbl>
## 1                0         441            4.32
## 2                1        1344           13.2 
## 3                2        1726           16.9 
## 4                3        2341           23.0 
## 5                4        2904           28.5 
## 6                5        1222           12.0 
## 7                6         221            2.17
## # A tibble: 7 × 4
## # Groups:   lottery status [7]
##   `lottery status` number_follow percentage_follow whole_percentage
##              <int>         <int>             <dbl>            <dbl>
## 1                0           257              5.46             2.52
## 2                1           837             17.8              8.21
## 3                2           587             12.5              5.76
## 4                3           958             20.4              9.39
## 5                4          1323             28.1             13.0 
## 6                5           604             12.8              5.92
## 7                6           141              3                1.38
## # A tibble: 7 × 6
## # Groups:   lottery status [7]
##   `lottery status` number_base percentage_base number_follow percentag…¹ whole…²
##              <int>       <int>           <dbl>         <int>       <dbl>   <dbl>
## 1                0         441            4.32           257        5.46    2.52
## 2                1        1344           13.2            837       17.8     8.21
## 3                2        1726           16.9            587       12.5     5.76
## 4                3        2341           23.0            958       20.4     9.39
## 5                4        2904           28.5           1323       28.1    13.0 
## 6                5        1222           12.0            604       12.8     5.92
## 7                6         221            2.17           141        3       1.38
## # … with abbreviated variable names ¹​percentage_follow, ²​whole_percentage
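The percentage columns in the tables above appear to follow simple ratios. The sketch below takes the drug-status counts printed above and recomputes the columns in base R, assuming (this is inferred from the printed numbers, not taken from the original code) that `percentage_base` divides by the baseline total, `percentage_follow` divides by the follow-up total, and `whole_percentage` divides follow-up counts by the baseline total.

```r
# Counts copied from the drug-status tables above.
drug_counts <- data.frame(
  drug_status   = c(0L, 1L),
  number_base   = c(9599L, 600L),
  number_follow = c(4488L, 219L)
)

total_base   <- sum(drug_counts$number_base)    # all baseline observations
total_follow <- sum(drug_counts$number_follow)  # completed follow-ups

# Assumed column definitions, inferred from the printed output.
drug_counts$percentage_base   <- round(100 * drug_counts$number_base / total_base, 2)
drug_counts$percentage_follow <- round(100 * drug_counts$number_follow / total_follow, 2)
drug_counts$whole_percentage  <- round(100 * drug_counts$number_follow / total_base, 2)

# Share of baseline respondents who completed the follow-up.
completion_rate <- round(total_follow / total_base, 4)

print(drug_counts)
print(completion_rate)
```

Under these assumptions the recomputed values agree with the printed tables up to display rounding, and the same three ratios applied to the lottery-status counts should give the lottery tables.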

Charts

To determine whether gambling tendencies are correlated with drug and alcohol usage, we made a chart showing the relationship between gambling tendencies and drug usage. The chart is color-coded: the top portion of each stacked bar represents people who always used drugs while they gambled, and the bottom portion represents people who never used drugs while they gambled. There seems to be a correlation between heavier gambling and drug usage. People with no gambling debt, who gambled the least among those surveyed, reported significantly less drug usage while they gambled, while people with more gambling debt reported more drug usage.
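The comparison behind a stacked bar chart like this one is a cross-tabulation of drug usage by debt category. The following is a rough sketch in base R on made-up data: the debt categories (`none`, `some`, `high`), the variable names, and the use of `barplot()` are assumptions for illustration and are not the code that produced the chart in this report.

```r
# Made-up responses; the category labels are hypothetical.
debt <- factor(
  c("none", "none", "none", "none", "some", "some", "some", "high", "high", "high"),
  levels = c("none", "some", "high")
)
drug_use <- factor(
  c("never", "never", "never", "always", "never", "never", "always",
    "never", "always", "always"),
  levels = c("never", "always")
)

# Rows are drug usage, columns are debt category.
counts <- table(drug_use = drug_use, debt = debt)

# Share of each debt category that falls in each drug-usage group
# (every column sums to 1).
shares <- prop.table(counts, margin = 2)

# Stacked bars: the first row ("never") is drawn at the bottom,
# the second row ("always") on top.
barplot(counts, legend.text = TRUE,
        xlab = "Gambling debt", ylab = "Number of respondents")

print(round(shares, 2))
```

Comparing the `always` row of `shares` across columns gives a direct numeric check on what the bar heights suggest visually, since raw counts alone can mislead when the debt categories differ in size.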

To determine whether personality has an effect on compulsive gambling, we looked for a relationship between gambling tendencies and impulsiveness. The measure of impulsiveness was taken from the NEO Personality Index: for the gambling survey, researchers used the “Impulsiveness” section of the NEO index to see whether impulsiveness as a personality trait is correlated with the tendency to gamble. The chart above shows the amount of self-reported debt that each gambler has alongside their impulsiveness, measured on a scale of 0 to 32 points, with 0 being the least impulsive and 32 the most impulsive. The chart suggests that any correlation between impulsiveness and the tendency to gamble is small. The peak of the frequency distribution in each gambling category fell at a similar point on the impulsiveness scale, and even those with more debt did not report significantly higher impulsiveness scores.
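A visual impression like this can be checked numerically by comparing mean impulsiveness across debt categories and computing a rank correlation. The sketch below uses invented numbers in base R; the ordinal coding of debt (0, 1, 2), the variable names, and the choice of Spearman's correlation are all assumptions for illustration rather than the analysis actually run for this report.

```r
# Invented data: debt coded as an ordinal level (hypothetical coding),
# impulsiveness on the 0 to 32 scale described above.
debt_level    <- c(0, 0, 0, 1, 1, 1, 2, 2, 2)
impulsiveness <- c(14, 18, 16, 15, 17, 16, 16, 15, 17)

# Mean impulsiveness score within each debt level.
mean_by_debt <- tapply(impulsiveness, debt_level, mean)

# Spearman's rank correlation suits an ordinal debt variable.
rho <- cor(debt_level, impulsiveness, method = "spearman")

print(mean_by_debt)
print(rho)
```

In this toy data the group means are identical by construction, so the correlation is weak; on the real survey data the same two summaries would show how small the relationship actually is.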

This chart shows alcohol usage versus gambler debt. (We are a three-person group, so this is just an extra graph for fun.)